Molecular mechanism of decision-making in glycosaminoglycan biosynthesis

Two major glycosaminoglycan types, heparan sulfate (HS) and chondroitin sulfate (CS), control many aspects of development and physiology in a type-specific manner. HS and CS are attached to core proteins via a common linker tetrasaccharide, but differ in their polymer backbones. How core proteins are specifically modified with HS or CS has been an enduring mystery. By reconstituting glycosaminoglycan biosynthesis in vitro, we establish that the CS-initiating N-acetylgalactosaminyltransferase CSGALNACT2 modifies all glycopeptide substrates equally, whereas the HS-initiating N-acetylglucosaminyltransferase EXTL3 is selective. Structure-function analysis reveals that acidic residues in the glycopeptide substrate and a basic exosite in EXTL3 are critical for specifying HS biosynthesis. Linker phosphorylation by the xylose kinase FAM20B accelerates linker synthesis and initiation of both HS and CS, but has no effect on the subsequent polymerisation of the backbone. Our results demonstrate that modification with CS occurs by default and must be overridden by EXTL3 to produce HS.


Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences
Behavioural & social sciences
Ecological, evolutionary & environmental sciences

For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

The crystal structure of apo EXTL3 has been deposited in the Protein Data Bank, accession code 8OG1. The crystal structure of EXTL3 with UDP and Mn²⁺ has been deposited in the Protein Data Bank, accession code 8OG4.

Sample size: Sample sizes were not predetermined based on statistical methods, but were chosen according to the standards of the field (three independent biological replicates for each condition).
Data exclusions: No data were excluded in any of the experiments or analyses shown.
Replication: Enzyme kinetic experiments, in-gel fluorescence experiments, and Western blot experiments were done with three independent biological replicates for each condition. Most HPLC experiments were done with at least two independent replicates, and always included appropriate negative and positive controls. For every experiment presented, the results were found to be reproducible.
Randomisation: The experiments were not randomised. In this study, related data resulted from biochemical assays on enzyme function and samples were not allocated to groups.
Blinding: Investigators were not blinded. This is an enzymological study and blinding is not typically used in the field. Blinding is also not necessary because the results are quantitative and did not require subjective judgment or interpretation.

nature portfolio | reporting summary
April 2023

Ethics oversight
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Animals and other research organisms
Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research, and Sex and Gender in Research

Laboratory animals
Authentication: The cell lines were obtained from the indicated commercial sources and used without further authentication.
Mycoplasma contamination: The cell lines were not tested for mycoplasma contamination. The cell lines were used for the production of recombinant proteins, which were subsequently validated by SDS-PAGE, Western blotting, or enzyme activity.
Commonly misidentified lines: Neither cell line is listed as a known misidentified cell line by the International Cell Line Authentication Committee.
Policy information about dual use research of concern

Hazards
Could the accidental, deliberate or reckless misuse of agents or technologies generated in the work, or the application of information presented in the manuscript, pose a threat to:

Experiments of concern
Does the work involve any of these experiments of concern:

ChIP-seq

Data deposition
Confirm that both raw and final processed data have been deposited in a public database such as GEO.
Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks.

Data access links
May remain private before publication.
For "Initial submission" or "Revised version" documents, provide reviewer access links. For your "Final submission" document, provide a link to the deposited data.

Files in database submission

The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.

Behavioral performance measures